Spatio-temporal Human Pose Detection
Author
Abstract
This thesis proposes a Chamfer-based method for human body pose detection that combines silhouette matching, motion information, and statistical relevance estimates in an original way. We demonstrate that our method can not only detect people but also recover their full 3D pose when they are seen from different viewpoints and at different scales, and when the background is cluttered and background subtraction is impractical because the camera moves. We introduce spatio-temporal templates that consist of short sequences of 2D silhouettes obtained from motion capture data. The motion information is thus inherent in the templates, which is important because human motion is very different from other kinds of motion and can be used effectively to distinguish humans from both static and moving background objects. The templates can handle different camera views as well as different scales, and they are matched against short image sequences. During a training phase, we use statistical learning techniques to estimate and store the relevance of the different silhouette parts to the recognition task. At run-time, we use these relevance estimates to convert Chamfer distances into meaningful probability estimates. For example, for walking motions, this accounts for the fact that the feet and shoulders provide much more discriminant information than the trunk. Using the probability estimates makes the recognition algorithm much more discriminating. To demonstrate our approach, we chose two types of motion: walking and golf swings. All the walking templates represent the specific part of the walking cycle where the feet are on the ground and the angle between the legs is greatest. The characteristic pose that we have chosen for a golf swing is the beginning of the downswing, when the arms are at their highest position. To further improve the performance of our algorithm, we use dynamic programming to link the various detections of walking people into plausible trajectories. We filter out the detections that do not lie on the recovered trajectory. This way, all detections whose orientation is wrong are eliminated, as well as any false positives on the background. Finally, we show that the reliable, specific 3D poses provided by our approach allow us to treat the 3D tracking of human motion as an interpolation problem, which, unlike traditional tracking approaches, is both robust and fully automated.
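As a rough illustration of the silhouette matching that the abstract refers to, the sketch below computes a plain Chamfer distance between template silhouette points and image edge points, and averages it over a short sequence. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the brute-force nearest-point search (practical systems typically precompute a distance transform), and the per-frame averaging are all illustrative choices, and the learned relevance weighting of silhouette parts is not modeled.

```python
import math


def chamfer_distance(template_points, edge_points):
    """Mean distance from each template point to its nearest image edge point.

    Both arguments are lists of (x, y) tuples. This is a brute-force version
    for illustration; it is an assumption, not the thesis's implementation.
    """
    if not template_points or not edge_points:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for tx, ty in template_points:
        # Distance from this template point to the closest edge point.
        total += min(math.hypot(tx - ex, ty - ey) for ex, ey in edge_points)
    return total / len(template_points)


def sequence_chamfer_distance(template_frames, edge_frames):
    """Average the per-frame Chamfer distance over a short sequence.

    Averaging across frames is a hypothetical way to score a
    spatio-temporal template against an image sequence.
    """
    if len(template_frames) != len(edge_frames) or not template_frames:
        raise ValueError("sequences must be non-empty and of equal length")
    per_frame = [
        chamfer_distance(t, e) for t, e in zip(template_frames, edge_frames)
    ]
    return sum(per_frame) / len(per_frame)
```

A lower score means the template silhouettes lie closer to the observed edges. A relevance-weighted variant would replace the uniform mean with per-point weights learned during training.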
Similar resources
Modeling and Spatio-Temporal Analysis of the Distribution of O3 in Tehran City Based on Neural Network and Spatial Analysis in GIS Environment
Air pollution is one of the most serious problems that people face today in metropolitan areas. Suspended particulates, carbon monoxide, sulfur dioxide, ozone, and nitrogen dioxide are the five major air pollutants that pose many problems to human health. The goal of this study is to propose a spatial approach for estimating and analyzing the spatial and temporal distribution of ozone based on ...
MASTER IN COMPUTER VISION AND ARTIFICIAL INTELLIGENCE REPORT OF THE RESEARCH PROJECT OPTION: COMPUTER VISION Pose and Face Recovery via Spatio-temporal GrabCut Human Segmentation
In this paper, we present a fully automatic Spatio-Temporal GrabCut human segmentation methodology that benefits from the combination of tracking and segmentation. GrabCut initialization is performed by a HOG-based subject detection, face detection, and skin color model for seed initialization. Spatial information is included by means of Mean Shift clustering, whereas temporal coherence is consi...
GrabCut-Based Human Segmentation in Video Sequences
In this paper, we present a fully automatic Spatio-Temporal GrabCut human segmentation methodology that combines tracking and segmentation. GrabCut initialization is performed by a HOG-based subject detection, face detection, and skin color model. Spatial information is included by Mean Shift clustering, whereas temporal coherence is considered through the history of Gaussian Mixture Models. Moreo...
Spatio-Temporal Variation of Suspended Sediment Concentration at Downstream of a Sand Mine
The growing population has led to a greater human need to use natural resources such as sand and gravel mines. Direct removal of sand from the river bed increases suspended sediment concentrations downstream of the harvested area and creates other problems, viz. filling of reservoirs, changes in the hydraulic characteristics of the channel, and environmental damage. However, the range of temporal and ...
A Data-driven Approach for Human Pose Tracking Based on Spatio-temporal Pictorial Structure
In this paper, we present a data-driven approach for human pose tracking in video data. We formulate the human pose tracking problem as a discrete optimization problem based on a spatio-temporal pictorial structure model and solve this problem very efficiently in a greedy framework. We propose the model to track the human pose by combining the human pose estimation from a single image and tradition...
Exploiting Spatio-temporal Constraints for Robust 2D Pose Tracking
We present a Spatio-temporal 2D Models Framework (STMF) for 2D pose tracking. Space and time are discretized, and a mixture of probabilistic “local models” is learnt, associating 2D Shapes and 2D Stick Figures. Those spatio-temporal models generalize well for a particular viewpoint and state of the tracked action, but some spatio-temporal discontinuities can appear along a sequence, as a direct con...